MMSE Feature Reconstruction Based on an Occlusion Model for Robust ASR

Authors

  • José A. González
  • Antonio M. Peinado
  • Ángel M. Gómez
Abstract

This paper proposes a novel compensation technique developed in the log-spectral domain. The proposal consists of a minimum mean square error (MMSE) estimator derived from an occlusion model [1]. According to this model, the effect of noise on speech is simplified to binary masking: the noise is completely masked by the speech when the speech power dominates, and the speech is masked by the noise when the noise is dominant. As with many MMSE-based techniques, a statistical model of clean speech is required; a Gaussian mixture model is employed here. The resulting technique has clear similarities with missing-data imputation techniques but, unlike them, it employs an explicit model of the noise. The experimental results show that the proposed MMSE estimator outperforms missing-data imputation with both binary and soft masks.
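To make the occlusion idea concrete, the following is a minimal sketch rather than the authors' estimator. It assumes a single log-spectral channel, a single Gaussian for clean speech (the paper uses a Gaussian mixture), a single Gaussian for noise, and the observation model y = max(x, n). The function name `occlusion_mmse` and all parameter names are hypothetical. Under these assumptions the estimate is a weighted sum of two cases: speech dominates (x = y), or speech is masked (x lies somewhere below y).

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF, written with erfc to stay accurate in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def occlusion_mmse(y, mu_x, sd_x, mu_n, sd_n):
    """Sketch of E[x | y] for y = max(x, n) with Gaussian x and n (assumed model)."""
    zx = (y - mu_x) / sd_x
    zn = (y - mu_n) / sd_n
    cdf_x = norm_cdf(zx)
    # Case 1: speech is observed (x = y) and noise is hidden (n <= y).
    like_speech = norm_pdf(zx) / sd_x * norm_cdf(zn)
    # Case 2: noise is observed (n = y) and speech is hidden (x <= y).
    like_noise = norm_pdf(zn) / sd_n * cdf_x
    if like_noise == 0.0:
        # The masked-speech case has numerically zero support: keep the observation.
        return y
    w_speech = like_speech / (like_speech + like_noise)
    # Mean of the speech Gaussian truncated to (-inf, y].
    x_masked = mu_x - sd_x * norm_pdf(zx) / cdf_x
    return w_speech * y + (1.0 - w_speech) * x_masked

if __name__ == "__main__":
    # Speech-dominated channel: the estimate stays close to the observation.
    print(occlusion_mmse(10.0, mu_x=10.0, sd_x=2.0, mu_n=0.0, sd_n=1.0))
    # Noise-dominated channel: the estimate falls back towards the speech prior.
    print(occlusion_mmse(5.0, mu_x=0.0, sd_x=1.0, mu_n=5.0, sd_n=1.0))
```

In this sketch the weight `w_speech` plays a role similar to a soft mask in missing-data imputation, except that it is computed from an explicit noise model instead of being supplied separately.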


Similar resources

Robust MMSE-FW-LA ASR S

In this paper, a novel feature weight (FW) algorithm for robust automatic speech recognition (ASR) is proposed. In this algorithm, each feature is weighted according to its credible probability; in particular, the weight factors are formulated and obtained from the gain coefficients generated as by-products of speech enhancement based on minimum mean square error (MMSE) estimation. Moreover...


Integration of DNN based speech enhancement and ASR

Speech enhancement employing Deep Neural Networks (DNNs) is gaining strength as a data-driven alternative to classical Minimum Mean Square Error (MMSE) enhancement approaches. In the past, Observation Uncertainty approaches to integrate MMSE speech enhancement with Automatic Speech Recognition (ASR) have yielded good results as a lightweight alternative for robust ASR. In this paper we thus exp...


Kalman and unscented kalman filter feature enhancement for noise robust ASR

Model-based feature enhancement is an ASR front-end technique to increase the robustness of the recogniser in noisy environments. However, its MMSE-estimates of the clean speech feature vectors are based only on the static components at the current frame. In this paper, we show how the Kalman filter framework can be seen as a natural extension that incorporates both the current and the previous...


Mask estimation in non-stationary noise environments for missing feature based robust speech recognition

In missing feature based automatic speech recognition (ASR), the role of the spectro-temporal mask in providing an accurate description of the relationship between target speech and environmental noise is critical for minimizing the degradation in ASR word accuracy (WAC) as the signal-to-noise ratio (SNR) decreases. This paper demonstrates the importance of accurate characterization of instanta...


Log-spectral feature reconstruction based on an occlusion model for noise robust speech recognition

This paper addresses the problem of feature compensation in the log-spectral domain for speech recognition in noise by recasting the speech distortion problem as an occlusion one. The usual non-linear mismatch function that represents the speech distortion due to additive noise can be reasonably well approximated by the maximum of the two mixing sources (speech and noise). Using this approximat...
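The quality of the maximum approximation mentioned above can be checked numerically. The sketch below assumes natural-log power spectra and the common additive mismatch function log(e^x + e^n), ignoring any phase cross-term; the function names are hypothetical. Under that assumption the gap between the exact value and max(x, n) equals log(1 + e^(-|x - n|)), which never exceeds log 2 and shrinks quickly as one source dominates.

```python
import math

def exact_mix(x, n):
    """Assumed mismatch function for additive noise: log(e^x + e^n), computed stably."""
    return max(x, n) + math.log1p(math.exp(-abs(x - n)))

def occlusion_mix(x, n):
    """Occlusion (log-max) approximation: the dominant source masks the other."""
    return max(x, n)

if __name__ == "__main__":
    # Print the approximation error for a few speech/noise log-energy gaps.
    for gap in (0.0, 1.0, 3.0, 6.0):
        err = exact_mix(gap, 0.0) - occlusion_mix(gap, 0.0)
        print(f"gap={gap:.1f}  error={err:.4f}")
```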




Publication date: 2012